4 research outputs found

    A Methodology to Measure the Diachronic Language Distance between Three Languages Based on Perplexity

    This is an Accepted Manuscript of an article published by Taylor & Francis in Journal of Quantitative Linguistics on 01 Mar 2020, available online: http://www.tandfonline.com/10.1080/09296174.2020.1732177
    The aim of this paper is to apply a corpus-based methodology, based on the measure of perplexity, to automatically calculate the cross-lingual language distance between historical periods of three languages. The three historical corpora have been collected with the closest spelling to the original and balanced between fiction and non-fiction. This methodology has been applied to measure the historical distance of Galician with respect to Portuguese and Spanish, from the Middle Ages to the end of the 20th century, both in original spelling and in automatically transcribed spelling. The quantitative results are contrasted with hypotheses drawn from experts in historical linguistics. Results show that Galician and Portuguese were varieties of the same language in the Middle Ages and that Galician has alternately converged with and diverged from Portuguese and Spanish since the last period of the 19th century. In this process, orthography plays a relevant role. It should be pointed out that the method is unsupervised and can be applied to other languages.
    This work has received financial support from DOMINO project [PGC2018-102041-B-I00, MCIU/AEI/FEDER, UE]; eRisk project [RTI2018-093336-B-C21]; the Consellería de Cultura, Educación e Ordenación Universitaria (accreditation 2016-2019, ED431G/08, Consolidation and structuring of Groups with Growth Potential: ED431B 2017/39) and the European Regional Development Fund (ERDF).
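The idea of a perplexity-based language distance mentioned in this abstract can be illustrated roughly as follows. This is a minimal sketch and not the authors' implementation: it assumes character trigram models with add-one smoothing, and it assumes the distance is the mean of the two cross-perplexities. The names `CharNgramModel` and `perplexity_distance` are hypothetical, and the input texts must be non-empty.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return the character n-grams of text, left-padded with spaces."""
    padded = " " * (n - 1) + text
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class CharNgramModel:
    """Character n-gram model with add-one (Laplace) smoothing (assumed setup)."""

    def __init__(self, text, n=3):
        self.n = n
        grams = char_ngrams(text, n)
        self.ngram_counts = Counter(grams)
        self.context_counts = Counter(g[:-1] for g in grams)
        self.vocab = set(text)

    def perplexity(self, text):
        """Perplexity of text under this model; lower means 'closer'."""
        grams = char_ngrams(text, self.n)
        # Unseen test characters are added to the vocabulary size
        # so that they receive non-zero smoothed probability.
        vocab_size = len(self.vocab | set(text))
        log_prob = 0.0
        for g in grams:
            numerator = self.ngram_counts[g] + 1
            denominator = self.context_counts[g[:-1]] + vocab_size
            log_prob += math.log(numerator / denominator)
        return math.exp(-log_prob / len(grams))


def perplexity_distance(text_a, text_b, n=3):
    """Symmetric distance: mean of the two cross-perplexities (an assumption)."""
    pp_ab = CharNgramModel(text_a, n).perplexity(text_b)
    pp_ba = CharNgramModel(text_b, n).perplexity(text_a)
    return (pp_ab + pp_ba) / 2
```

In a setting like the one described, `text_a` and `text_b` would be corpora from two languages or two historical periods, and a lower value would be read as a smaller distance. Real experiments would need far larger corpora and a held-out split between training and test text.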

    Performance and microbial features of the partial nitritation-anammox process treating fish canning wastewater with variable salt concentrations

    The partial nitritation-anammox (PN-AMX) process applied to wastewaters with high NaCl concentration has so far been studied using simulated media, without considering the effect of organic matter concentration and the shift in microbial populations. This research work presents results on the application of this process to the treatment of saline industrial wastewater. The results indicated that the PN-AMX process is able to recover its initial activity after a sudden/acute salt inhibition event (up to 16 g NaCl/L). With a progressive salt concentration increase over 150 days, the PN-AMX process was able to remove 80% of the nitrogen at 7–9 g NaCl/L. The microbiological data indicated that NaCl and ammonia concentrations and temperature are important factors shaping PN-AMX communities. Thus, the NOB abundance (Nitrospira) decreases as the salt concentration increases, while heterotrophic denitrifiers are able to outcompete anammox after a peak of organic matter in the feeding.
    This work was supported by the Spanish Government through GRANDSEA (CTM2014-55397-JIN) and FISHPOL (CTQ2014-55021-R) projects co-funded by FEDER, and the Chilean Government (CONICYT/FONDAP/15130015). The authors from the USC belong to CRETUS (AGRUP2015/02) and the Galician Competitive Research Group (GRC 2013-032), programs co-funded by FEDER.

    Measuring Language Distance of Isolated European Languages

    Phylogenetics is a sub-field of historical linguistics whose aim is to classify a group of languages by considering their distances within a rooted tree that stands for their historical evolution. A few European languages do not belong to the Indo-European family or are otherwise isolated in the European rooted tree. Although it is not possible to establish phylogenetic links using basic strategies, it is possible to calculate the distances between these isolated languages and the rest using simple corpus-based techniques and natural language processing methods. The objective of this article is to select some isolated languages and measure the distance between them and from the other European languages, so as to shed light on the linguistic distances and proximities of these controversial languages without considering phylogenetic issues. The experiments were carried out with 40 European languages including six languages that are isolated in their corresponding families: Albanian, Armenian, Basque, Georgian, Greek, and Hungarian.
    This work received financial support from DOMINO project (PGC2018-102041-B-I00, MCIU/AEI/FEDER, UE), eRisk project (RTI2018-093336-B-C21), the Consellería de Cultura, Educación e Ordenación Universitaria (accreditation 2016-2019, ED431G/08, Consolidation and structuring of Groups with Growth Potential: ED431B 2017/39), and the European Regional Development Fund (ERDF).